Identifying Long Haplotype Blocks with Low Diversity
Abstract
Given an m×n haplotype matrix A, we present linear-time algorithms for finding all interval diversities, farthest sites, and the longest block with low diversity. For selecting multiple long blocks under a diversity constraint, we show that k blocks with the longest total length can be found in O(nk) time. We also propose linear-time algorithms for computing all intra-longest-blocks and all intra-k-longest-blocks. To deal with missing SNP data, we propose methods that clarify ambiguous SNP sites; furthermore, we show some hardness results concerning the minimum di-
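To make the "longest block with low diversity" problem concrete, here is a minimal Python sketch. It assumes that the diversity of a column interval is the number of distinct rows of A restricted to that interval; the abstract does not spell out this definition, so treat it as an assumption. The function names `diversity` and `longest_low_diversity_block` are hypothetical. This is a simple sliding-window baseline, not the linear-time algorithm the abstract refers to: it recomputes the diversity of each window from scratch.

```python
from typing import Sequence, Tuple


def diversity(A: Sequence[Sequence[int]], i: int, j: int) -> int:
    """Assumed definition: number of distinct rows of A on columns i..j (inclusive)."""
    return len({tuple(row[i:j + 1]) for row in A})


def longest_low_diversity_block(
    A: Sequence[Sequence[int]], max_div: int
) -> Tuple[int, int]:
    """Return (start, end), inclusive, of the leftmost longest column interval
    whose diversity is at most max_div, or (-1, -1) if no interval qualifies.

    Under the assumed definition, diversity never decreases when an interval
    is extended, so the left pointer only ever moves to the right.
    """
    n = len(A[0]) if A else 0
    best, best_len = (-1, -1), 0
    left = 0
    for right in range(n):
        # Shrink the window from the left until it satisfies the constraint.
        while left <= right and diversity(A, left, right) > max_div:
            left += 1
        if left <= right and right - left + 1 > best_len:
            best, best_len = (left, right), right - left + 1
    return best


if __name__ == "__main__":
    # Toy 4x4 haplotype matrix: rows are haplotypes, columns are SNP sites.
    A = [
        [0, 0, 1, 1],
        [0, 0, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 1],
    ]
    print(longest_low_diversity_block(A, 2))
```

Each diversity check costs O(m·w) for a window of width w, so this baseline is far from linear time; it is only meant to illustrate the problem statement.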
Similar resources
Efficient Algorithms for SNP Haplotype Block Selection Problems
Global patterns of human DNA sequence variation (haplotypes) defined by common single nucleotide polymorphisms (SNPs) have important implications for identifying disease associations and human traits. Recent genetics research reveals that SNPs within certain haplotype blocks induce only a few distinct common haplotypes in the majority of the population. The existence of haplotype block structur...
Efficient Haplotype Block Partitioning and Tag SNP Selection Algorithms under Various Constraints
Patterns of linkage disequilibrium play a central role in genome-wide association studies aimed at identifying genetic variation responsible for common human diseases. These patterns in human chromosomes show a block-like structure, and regions of high linkage disequilibrium are called haplotype blocks. A small subset of SNPs, called tag SNPs, is sufficient to capture the haplotype patterns in...
Haplotype Block Partitioning and TagSNP Selection on Human Chromosome 21
A Single Nucleotide Polymorphism, or SNP, is a DNA sequence variation that occurs when a single nucleotide in the genome differs between members of a species. Recent research reveals that SNPs within certain haplotype blocks induce only a few distinct common haplotypes in the majority of the population. The existence of haplotype block structures has serious implications for association-based methods...
Patterns of Linkage Disequilibrium and Long Range Hitchhiking in Evolving Experimental Drosophila melanogaster Populations
Whole-genome resequencing of experimental populations evolving under a specific selection regime has become a popular approach to determine genotype-phenotype maps and understand adaptation to new environments. Despite its conceptual appeal and success in identifying some causative genes, it has become apparent that many studies suffer from an excess of candidate loci. Several explanations have...
Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21.
Global patterns of human DNA sequence variation (haplotypes) defined by common single nucleotide polymorphisms (SNPs) have important implications for identifying disease associations and human traits. We have used high-density oligonucleotide arrays, in combination with somatic cell genetics, to identify a large fraction of all common human chromosome 21 SNPs and to directly observe the haploty...